minimax approach
A Minimax Approach to Supervised Learning
Given a task of predicting Y from X, a loss function L, and a set of probability distributions Gamma on (X,Y), what is the optimal decision rule minimizing the worst-case expected loss over Gamma? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models as well as some popular regularization schemes. For quadratic and logarithmic loss functions we revisit well-known linear and logistic regression models. Moreover, for the 0-1 loss we derive a classifier which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed Gamma by solving a tractable optimization problem. We perform several numerical experiments to show the power of the minimax SVM in outperforming the SVM.
Theoretical Analysis of Adversarial Learning: A Minimax Approach
In this paper, we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be transformed into a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of Lipschitz condition. Our method can be applied to multi-class classification and popular loss functions including the hinge loss and ramp loss. As some illustrative examples, we derive the adversarial risk bounds for SVMs and deep neural networks, and our bounds have two data-dependent terms, which can be optimized for achieving adversarial robustness.
A Minimax Approach to Supervised Learning
Given a task of predicting Y from X, a loss function L, and a set of probability distributions Gamma on (X,Y), what is the optimal decision rule minimizing the worst-case expected loss over Gamma? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models as well as some popular regularization schemes. For quadratic and logarithmic loss functions we revisit well-known linear and logistic regression models. Moreover, for the 0-1 loss we derive a classifier which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed Gamma by solving a tractable optimization problem. We perform several numerical experiments to show the power of the minimax SVM in outperforming the SVM.
Reviews: Theoretical Analysis of Adversarial Learning: A Minimax Approach
Originality: I find the approach original and interesting, I find that other works have been cited and the section of related work is written clearly and detailed, it gives a nice overview. I think only that it is important to highlight more clearly the differences between [40] and the current work. In particular, it is unclear what is the penalty parameter, and how their method of adversarial training relates to this work - do they optimize a different bound or what quantities do they optimize, and do these quantities show up in the proposed bound? Quality: the work seems complete, and sound for as far as I could check. I could not check all the proofs in detail but I read the work in great detail.
Reviews: A Minimax Approach to Supervised Learning
The technical results appear to be correct and the experimental results (which I think are quite preliminary) suggest the minimax SVM might be a good idea. I think the idea of robust Bayes decision rules makes sense and the authors show how under squared loss a connection to the Huber loss emerges. My main comment is that the paper itself is a somewhat difficult read due to terseness at key places, which might limit the impact of the paper. So, the rest of my comments are just geared towards improving the clarity of the paper. Technically, in every instance where the authors apply Danskin's theorem, it was not really clear what form of Danskin's theorem was being used, and therefore it was difficult to follow the derivation.
Theoretical Analysis of Adversarial Learning: A Minimax Approach
In this paper, we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be transformed into a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of Lipschitz condition. Our method can be applied to multi-class classification and popular loss functions including the hinge loss and ramp loss. As some illustrative examples, we derive the adversarial risk bounds for SVMs and deep neural networks, and our bounds have two data-dependent terms, which can be optimized for achieving adversarial robustness.
Distributionally Robust Learning in Heterogeneous Contexts
Osama, Muhammad, Zachariah, Dave, Stoica, Petre
We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts. We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach. The proposed method is computationally feasible and provides statistical guarantees. We demonstrate its performance using both real and synthetic data.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
Theoretical Analysis of Adversarial Learning: A Minimax Approach
Tu, Zhuozhuo, Zhang, Jingwei, Tao, Dacheng
In this paper, we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be transformed into a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of Lipschitz condition. Our method can be applied to multi-class classification and popular loss functions including the hinge loss and ramp loss.
A Minimax Approach to Supervised Learning
Given a task of predicting Y from X, a loss function L, and a set of probability distributions Gamma on (X,Y), what is the optimal decision rule minimizing the worst-case expected loss over Gamma? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models as well as some popular regularization schemes. For quadratic and logarithmic loss functions we revisit well-known linear and logistic regression models. Moreover, for the 0-1 loss we derive a classifier which we call the minimax SVM.